drum track
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Robot Drummer: Learning Rhythmic Skills for Humanoid Drumming
Shahid, Asad Ali, Braghin, Francesco, Roveda, Loris
Humanoid robots have seen remarkable advances in dexterity, balance, and locomotion, yet their role in expressive domains such as music performance remains largely unexplored. Musical tasks, like drumming, present unique challenges, including split-second timing, rapid contacts, and multi-limb coordination over performances lasting minutes. In this paper, we introduce Robot Drummer, a humanoid capable of expressive, high-precision drumming across a diverse repertoire of songs. We formulate humanoid drumming as the sequential fulfillment of timed contacts and transform drum scores into a Rhythmic Contact Chain. To handle the long-horizon nature of musical performance, we decompose each piece into fixed-length segments and train a single policy across all segments in parallel using reinforcement learning. Through extensive experiments on over thirty popular rock, metal, and jazz tracks, our results demonstrate that Robot Drummer consistently achieves high F1 scores. The learned behaviors exhibit emergent human-like drumming strategies, such as cross-arm strikes and adaptive stick assignments, demonstrating the potential of reinforcement learning to bring humanoid robots into the domain of creative musical performance. Project page: robotdrummer.github.io
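The score-to-contact-chain idea above can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: the event times, drum names, and segment length are made-up assumptions, and the real system feeds each segment to an RL policy rather than just bucketing events.

```python
# Hypothetical sketch: turn a drum score into a time-ordered "rhythmic
# contact chain", then split it into fixed-length segments so one shared
# policy can train on all segments in parallel (as the paper describes).

def to_contact_chain(score):
    """Sort (time_sec, drum) events into a time-ordered chain of timed contacts."""
    return sorted(score, key=lambda e: e[0])

def segment_chain(chain, segment_len):
    """Split the chain into fixed-length time windows."""
    if not chain:
        return []
    horizon = chain[-1][0]
    n_segments = int(horizon // segment_len) + 1
    segments = [[] for _ in range(n_segments)]
    for t, drum in chain:
        segments[int(t // segment_len)].append((t, drum))
    return segments

# Toy score: (time in seconds, drum to strike)
score = [(0.0, "kick"), (0.5, "snare"), (1.0, "kick"), (2.2, "hihat"), (3.1, "snare")]
segments = segment_chain(to_contact_chain(score), segment_len=2.0)
# segments[0] holds the contacts in [0, 2) s, segments[1] those in [2, 4) s
```

Fixed-length segmentation is what makes minutes-long pieces tractable: each segment is a short-horizon RL episode, while the shared policy ties them back into one performance.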
- North America > United States > California (0.04)
- Europe > Switzerland (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
JukeDrummer: Conditional Beat-aware Audio-domain Drum Accompaniment Generation via Transformer VQ-VAE
Wu, Yueh-Kao, Chiu, Ching-Yu, Yang, Yi-Hsuan
This paper proposes a model that generates a drum track in the audio domain to play along to a user-provided drum-free recording. Specifically, using paired data of drumless tracks and the corresponding human-made drum tracks, we train a Transformer model to improvise the drum part of an unseen drumless recording. We combine two approaches to encode the input audio. First, we train a vector-quantized variational autoencoder (VQ-VAE) to represent the input audio with discrete codes, which can then be readily used in a Transformer. Second, using an audio-domain beat tracking model, we compute beat-related features of the input audio and use them as embeddings in the Transformer. Instead of generating the drum track directly as waveforms, we use a separate VQ-VAE to encode the mel-spectrogram of a drum track into another set of discrete codes, and train the Transformer to predict the sequence of drum-related discrete codes. The output codes are then converted to a mel-spectrogram with a decoder, and then to the waveform with a vocoder. We report both objective and subjective evaluations of variants of the proposed model, demonstrating that the model with beat information generates drum accompaniment that is rhythmically and stylistically consistent with the input audio.
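The two-stream conditioning described above can be sketched with plain embedding tables. This is a minimal illustration under assumed shapes (codebook size, beat resolution, and embedding dimension are invented), not the paper's architecture: it only shows how VQ codes and beat features combine into one input sequence for the Transformer.

```python
# Minimal sketch of JukeDrummer-style conditioning: the Transformer input is
# built from (a) VQ-VAE discrete codes of the drumless audio and (b) beat
# embeddings from a beat tracker. All sizes and values here are toy assumptions.
import numpy as np

rng = np.random.default_rng(0)
CODEBOOK, BEAT_POS, DIM = 512, 4, 8             # VQ codebook size, beat positions, embed dim
code_emb = rng.normal(size=(CODEBOOK, DIM))     # embedding table for VQ codes
beat_emb = rng.normal(size=(BEAT_POS, DIM))     # embedding table for beat position

vq_codes = [17, 301, 44, 17]                    # discrete codes from the input-audio VQ-VAE
beat_pos = [0, 1, 2, 3]                         # per-frame beat position from the beat tracker

# Summed embeddings form the conditioning sequence; the Transformer then
# predicts the *drum-track* VQ codes, which a decoder + vocoder turn into audio.
x = code_emb[vq_codes] + beat_emb[beat_pos]
```

The key design point is that rhythm enters twice: implicitly through the audio codes and explicitly through the beat embeddings, which is what the paper's evaluation credits for rhythmic consistency.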
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Generating Coherent Drum Accompaniment With Fills And Improvisations
Dahale, Rishabh, Talwadker, Vaibhav, Rao, Preeti, Verma, Prateek
Creating a complex work of art like music necessitates profound creativity. With recent advancements in deep learning and powerful models such as transformers, there has been huge progress in automatic music generation. In an accompaniment-generation context, creating a coherent drum pattern with apposite fills and improvisations at proper locations in a song is a challenging task even for an experienced drummer. Drum beats tend to follow a repetitive pattern through stanzas, with fills or improvisation at section boundaries. In this work, we tackle the task of drum pattern generation conditioned on the accompanying music played by four melodic instruments: Piano, Guitar, Bass, and Strings. We use a transformer sequence-to-sequence model to generate a basic drum pattern conditioned on the melodic accompaniment, and find that improvisation is largely absent, likely due to its relatively low representation in the training data. We propose a novelty function to capture the extent of improvisation in a bar relative to its neighbors. We train a model to predict improvisation locations from the melodic accompaniment tracks. Finally, we use a novel BERT-inspired in-filling architecture to learn the structure of both the drums and the melody and to in-fill the improvised sections.
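A bar-level novelty function of the kind described can be sketched as follows. This is a hypothetical formulation, not the authors' exact one: representing a bar as a binary hit vector and using the Hamming distance to neighboring bars are illustrative choices.

```python
# Hypothetical sketch of a novelty function: score each bar by how much its
# drum pattern differs from its neighbors, so fills (which break repetition)
# stand out. Bar encoding and distance metric are assumptions, not the paper's.

def novelty(bars):
    """bars: list of equal-length binary hit vectors, one per bar.
    Returns, per bar, the mean normalized Hamming distance to adjacent bars."""
    scores = []
    for i, bar in enumerate(bars):
        neighbors = [bars[j] for j in (i - 1, i + 1) if 0 <= j < len(bars)]
        dists = [sum(a != b for a, b in zip(bar, n)) / len(bar) for n in neighbors]
        scores.append(sum(dists) / len(dists))
    return scores

groove = [1, 0, 1, 0]          # repeated basic pattern
fill   = [1, 1, 1, 1]          # a fill breaks the repetition
scores = novelty([groove, groove, fill, groove, groove])
# the fill bar scores highest, flagging it as a likely improvisation location
```

Because drum parts are repetitive except at section boundaries, a neighbor-relative score like this turns "find the improvisations" into simple peak-picking over the bars.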
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > India > Karnataka > Bengaluru (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
DrumNet
Sony CSL Paris develops technology for AI-assisted music production. The goal is not to replace musicians, but to provide them with better tools to realize their creative ideas more efficiently. DrumNet is based on an artificial neural network that learns rhythmic relationships between different instruments and encodes these relationships in a 16-dimensional style space. A similar example is the Logic Pro X Drummer, which lets the user specify a playing style by navigating a two-dimensional space. Unlike the Logic Pro X Drummer, however, DrumNet dynamically adapts to the existing music.
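The "navigating a style space" idea can be illustrated with a toy interpolation. DrumNet's network is not reproduced here; the 16-dimensional vectors and the blend function below are placeholder assumptions showing only what it means to move through such a space.

```python
# Illustrative sketch of navigating a learned style space like DrumNet's
# 16-dimensional one: interpolate between two style vectors to blend styles.
# The vectors are placeholders; a real system would decode each point into a groove.
import numpy as np

def blend(style_a, style_b, alpha):
    """Linear interpolation in style space; alpha=0 gives style_a, alpha=1 gives style_b."""
    return (1 - alpha) * style_a + alpha * style_b

style_a = np.zeros(16)          # one learned drumming style (placeholder)
style_b = np.ones(16)           # another learned style (placeholder)
halfway = blend(style_a, style_b, 0.5)
```

The extra dimensions (16 versus Logic's 2) give the model room to encode finer rhythmic relationships; the dynamic adaptation is what the interpolation alone cannot show.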
Deep Unsupervised Drum Transcription
We introduce DrummerNet, a drum transcription system that is trained in an unsupervised manner. DrummerNet does not require any ground-truth transcription and, with the data-scalability of deep neural networks, learns from a large unlabeled dataset. In DrummerNet, the target drum signal is first passed to a (trainable) transcriber, then reconstructed in a (fixed) synthesizer according to the transcription estimate. By training the system to minimize the distance between the input and the output audio signals, the transcriber learns to transcribe without ground truth transcription. Our experiment shows that DrummerNet performs favorably compared to many other recent drum transcription systems, both supervised and unsupervised.
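DrummerNet's training signal — transcribe, re-synthesize with a fixed synthesizer, and compare to the input — can be sketched in miniature. This toy version uses a one-drum impulse-response "synthesizer" and hand-made signals, all assumptions; the real system uses deep networks and a perceptually motivated distance.

```python
# Minimal sketch of DrummerNet's unsupervised loop: a transcriber proposes
# onset activations, a *fixed* synthesizer renders them back to audio, and the
# reconstruction error trains the transcriber without ground-truth labels.
import numpy as np

drum_ir = np.array([1.0, 0.6, 0.3])                 # fixed synthesizer: one drum's impulse response

def synthesize(onsets):
    """Fixed (non-trainable) synthesizer: convolve onset activations with the drum sound."""
    return np.convolve(onsets, drum_ir)[: len(onsets)]

def reconstruction_loss(audio, onsets):
    """Distance between the input audio and the re-synthesized transcription estimate."""
    return float(np.mean((audio - synthesize(onsets)) ** 2))

true_onsets = np.array([1.0, 0, 0, 1.0, 0, 0])
audio = synthesize(true_onsets)                     # the "recording" the system hears

good = reconstruction_loss(audio, true_onsets)      # perfect transcription: zero loss
bad = reconstruction_loss(audio, np.zeros(6))       # missing every hit: positive loss
```

Because the synthesizer is fixed, the only way to drive the loss down is to get the transcription right — which is why no labeled transcriptions are needed.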
- Europe > Portugal > Aveiro > Aveiro (0.04)
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > North Yorkshire > York (0.04)
- Media > Music (0.47)
- Leisure & Entertainment (0.47)
LSTMetallica:… « Another Word For It
LSTMetallica: generating drum tracks by learning from the drum tracks of 60 Metallica songs, by Keunwoo Choi. A word-level RNN (LSTM) in Keras trained on wordified text representations of Metallica's drumming MIDI files, which came from mididatabase.com. I mention this in part to inject some variety into the machine learning resources I have covered. The failures of machine learning recommendations can be amusing; for the most part, when it works, the results are rather dull.